@srh srh commented Jul 9, 2025

…ministically

This orders columns in plan generation by the order in which they are first seen, instead of relying on hash table iteration order. Note that this affects internal plan nodes and does not change the output of any correctly-running queries. The effect is to make query behavior deterministic and reproducible when investigating other bugs in query evaluation.
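For readers unfamiliar with the technique, here is a minimal sketch of first-seen ordering using only the standard library: a `HashSet` handles deduplication while a `Vec` remembers insertion order. The names `OrderedRecorder`, `record`, and `into_ordered` are hypothetical and chosen for illustration; this is not the actual `ColumnRecorder` code from the PR.

```rust
use std::collections::HashSet;
use std::hash::Hash;

/// Hypothetical stand-in for a column recorder (not the PR's real type).
/// Records each distinct item once, preserving first-seen order.
#[derive(Debug)]
pub struct OrderedRecorder<T> {
    /// Used only for membership checks; its iteration order is never observed.
    seen: HashSet<T>,
    /// Holds the items in the order they were first recorded.
    ordered: Vec<T>,
}

impl<T: Eq + Hash + Clone> OrderedRecorder<T> {
    pub fn new() -> Self {
        Self {
            seen: HashSet::new(),
            ordered: Vec::new(),
        }
    }

    /// Records `item` if it has not been seen before.
    /// Returns `true` when the item was newly recorded.
    pub fn record(&mut self, item: &T) -> bool {
        if self.seen.insert(item.clone()) {
            self.ordered.push(item.clone());
            true
        } else {
            false
        }
    }

    /// Consumes the recorder and returns the items in first-seen order.
    pub fn into_ordered(self) -> Vec<T> {
        self.ordered
    }
}
```

Iterating over `seen` directly could yield a different order from run to run, because the default hasher in Rust's standard library is randomly seeded. Reading from `ordered` avoids that.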

Check List

  • Tests have been run in packages where changes made if available
  • Linter has been run for changed code
  • Tests for the changes have been added if not covered yet
  • Docs have been added / updated if required

@srh srh requested a review from waralexrom July 9, 2025 02:05
@srh srh requested a review from a team as a code owner July 9, 2025 02:05

#[derive(Default)]
struct ColumnRecorder<T: ColumnCollector> {
    column_hash: HashSet<Column>,
Member

@ovr ovr Jul 9, 2025

@srh, what do you think about using indexmap? The code would be much simpler than maintaining a separate HashSet and Vec<> to solve the ordering issue.

Contributor Author

Sure.

@srh srh merged commit 08f21b6 into master Jul 9, 2025
53 checks passed
@srh srh deleted the fix-projection_above_limit-determinism branch July 9, 2025 18:25
Frank-TXS pushed a commit to Helge-TXS/cube that referenced this pull request Aug 5, 2025
cube-js#9766)

* fix(cubestore): Make projection_above_limit optimization behave deterministically

* chore(cubestore): Make projection_above_limit::ColumnRecorder use IndexSet